Active Subtopic Detection in Multitopic Data

Authors

  • Benjamin Bergner
  • Georg Krempl
Abstract

Subtopic detection is a useful text information retrieval tool for creating relations between documents and adding descriptive information to them. For the task of detecting subtopics with user guidance, clustering by intent (CBI) has recently been proposed. However, this approach is limited to single-topic environments. We extend it to interactive subtopic detection in multi-topic environments and to the incorporation of positive and negative user feedback. Our multi-topic clustering by intent (MCBI) approach iteratively constructs so-called similarity sets of documents within the same topic, derives candidates for new subtopics, and actively queries feedback from the user, which is then used to refine the subtopic and similarity sets in the next iteration. For evaluation, we construct a corpus of the Wikipedia articles for the 4309 most common English nouns, covering a broad range of topics. Our MCBI approach is compared with the recently proposed CBI approach and with random sampling. Each approach is evaluated by the number of subtopics it finds in the same predefined, closed topic (countries). The results show that MCBI finds up to 137% and 445% more correct subtopics than random term selection and CBI, respectively.
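
The iterative query-and-refine cycle described in the abstract can be illustrated with a small sketch. This is not the authors' MCBI algorithm, whose details are not given here; it is a generic relevance-feedback loop under assumptions of our own: documents are represented as term sets, similarity is Jaccard overlap, and a candidate is scored by its mean similarity to accepted items minus its mean similarity to rejected ones. All names (`feedback_loop`, `oracle`, `budget`, the toy corpus) are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap between two term sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mean_similarity(name: str, group: List[str], docs: Dict[str, Set[str]]) -> float:
    """Average similarity of one document to a group (0.0 for an empty group)."""
    if not group:
        return 0.0
    return sum(jaccard(docs[name], docs[g]) for g in group) / len(group)


def feedback_loop(
    docs: Dict[str, Set[str]],
    seeds: List[str],
    oracle: Callable[[str], bool],
    budget: int,
) -> Tuple[List[str], List[str]]:
    """Assumed stand-in for an active loop: query the best candidate, record the answer."""
    accepted: List[str] = list(seeds)   # positive feedback so far
    rejected: List[str] = []            # negative feedback so far
    for _ in range(budget):
        pool = [d for d in docs if d not in accepted and d not in rejected]
        if not pool:
            break

        def score(name: str) -> float:
            return mean_similarity(name, accepted, docs) - mean_similarity(name, rejected, docs)

        # Highest score first; ties are broken alphabetically for determinism.
        candidate = min(pool, key=lambda d: (-score(d), d))
        (accepted if oracle(candidate) else rejected).append(candidate)
    return accepted, rejected


# Toy corpus (invented for illustration; not the Wikipedia corpus from the paper).
toy_docs = {
    "France": {"country", "europe", "capital", "paris"},
    "Germany": {"country", "europe", "capital", "berlin"},
    "Japan": {"country", "asia", "capital", "tokyo"},
    "Apple": {"fruit", "tree", "sweet"},
    "Banana": {"fruit", "yellow", "sweet"},
}
accepted, rejected = feedback_loop(
    toy_docs,
    seeds=["France"],
    oracle=lambda name: name in {"Germany", "Japan"},  # simulated user
    budget=3,
)
```

In this toy run the simulated user confirms "Germany" and "Japan" and rejects "Apple". Subtracting the similarity to rejected items is one simple way to use negative feedback; the paper's actual use of similarity sets may differ.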

Similar resources

Synthesis and catalytic activity of heterogeneous rare-earth metal catalysts coordinated with multitopic Schiff-base ligands.

Four multitopic Schiff-base ligand precursors were synthesized via condensation of 4,4'-diol-3,3'-diformyl-1,1'-diphenyl or 1,3,5-tris(4-hydroxy-5-formylphenyl)benzene with 2,6-diisopropylaniline or 2,6-dimethylaniline. Amine elimination reactions of Ln[N(SiMe(3))(2)](3) (Ln = La, Nd, Sm or Y) with these multitopic ligand precursors gave ten heterogeneous rare-earth metal catalysts. These heter...

Mobile information retrieval with search results clustering: Prototypes and evaluations

Web searches from mobile devices such as PDAs and cell phones are becoming increasingly popular. However, the traditional list-based search interface paradigm does not scale well to mobile devices due to their inherent limitations. In this article, we investigate the application of search results clustering, used with some success for desktop computer searches, to the mobile scenario. Building ...

Udel @ NTCIR-11 IMine Track

This paper describes our participation in the Intent Mining track of NTCIR-11. We present our methods and results for both document ranking and subtopic mining. Our ranking methods are based on several data fusion techniques with some variations. Our subtopic mining method is a very simple technique that uses query dimensions’ items to form a subtopic

SEM12 at the NTCIR-10 INTENT-2 English Subtopic Mining Subtask

Users express their information needs in terms of queries in search engines to find some relevant documents on the Internet. However, search queries are usually short, ambiguous and/or underspecified. To understand user’s search intent, subtopic mining plays an important role and has attracted attention in the recent years. In this paper, we describe our approach to identifying, and then rankin...

Publication date: 2016